Classes for fast maximum entropy training

Author

  • Joshua Goodman
Abstract

Maximum entropy models are considered by many to be one of the most promising avenues of language modeling research. Unfortunately, long training times make maximum entropy research difficult. We present a novel speedup technique: we change the form of the model to use classes. Our speedup works by creating two maximum entropy models, the first of which predicts the class of each word, and the second of which predicts the word itself. This factoring of the model leads to fewer nonzero indicator functions, and faster normalization, achieving speedups of up to a factor of 35 over one of the best previous techniques. It also results in typically slightly lower perplexities. The same trick can be used to speed training of other machine learning techniques, e.g. neural networks, applied to any problem with a large number of outputs, such as language modeling.
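The class factoring described in the abstract can be illustrated with a minimal sketch. It assumes the probability of a word is the product of a class probability and a word-within-class probability, P(w | h) = P(class(w) | h) * P(w | h, class(w)). Everything concrete here is a hypothetical stand-in and not taken from the paper: the vocabulary size, the number of classes, the round-robin class assignment, and the random scores used in place of a trained model's weighted feature sums.

```python
import math
import random


def softmax(scores):
    """Normalize raw scores into probabilities (numerically stable)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def class_factored_prob(w, class_scores, word_scores, word_class, class_members):
    """P(w | h) = P(class(w) | h) * P(w | h, class(w))."""
    c = word_class[w]
    p_class = softmax(class_scores)[c]  # normalize over classes only
    members = class_members[c]
    within = softmax([word_scores[v] for v in members])  # normalize within one class
    return p_class * within[members.index(w)]


# Hypothetical toy setup: 2,500 words split evenly into 50 classes.
VOCAB, NUM_CLASSES = 2_500, 50
word_class = [w % NUM_CLASSES for w in range(VOCAB)]
class_members = [[] for _ in range(NUM_CLASSES)]
for w, c in enumerate(word_class):
    class_members[c].append(w)

# Random scores stand in for the weighted feature sums of a trained model.
rng = random.Random(0)
class_scores = [rng.gauss(0.0, 1.0) for _ in range(NUM_CLASSES)]
word_scores = [rng.gauss(0.0, 1.0) for _ in range(VOCAB)]

# The factored model is still a proper distribution over the vocabulary.
total = sum(
    class_factored_prob(w, class_scores, word_scores, word_class, class_members)
    for w in range(VOCAB)
)

flat_terms = VOCAB  # a flat model sums over every word
factored_terms = NUM_CLASSES + len(class_members[0])  # classes + one class's words
print(f"sum of probabilities: {total:.6f}")
print(f"terms per normalization: flat={flat_terms}, factored={factored_terms}")
```

In this toy setting each prediction normalizes over 50 classes plus 50 words instead of 2,500 words. The count only illustrates why normalization gets cheaper; it is not a reproduction of the speedups reported in the abstract.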


Similar resources

A maximum entropy semantic parser using word classes

This paper describes the parser that is used in the Sail Labs Conversational System, which is a spoken dialog system. This parser is a fully statistical, semantic parser. The probability model of the parser is based on the principle of maximum entropy. The maximum entropy framework allows the available information to be combined in a fully automatic way, but the training of maximum entropy models i...


Training Procedures for Overlabelled Data

It is common in perception tasks to have overlabelled training observations. For example, demonstrated images are segmented into a number of classes (e.g., car, person, road, tree), but the desired classifier need only distinguish between known groups of classes, e.g., obstacle (containing car, person and tree) or drivable. A simple solution is to train one’s favorite classifier on eit...


Generalized Expectation Criteria for Semi-Supervised Learning with Weakly Labeled Data

In this paper, we present an overview of generalized expectation criteria (GE), a simple, robust, scalable method for semi-supervised training using weakly-labeled data. GE fits model parameters by favoring models that match certain expectation constraints, such as marginal label distributions, on the unlabeled data. This paper shows how to apply generalized expectation criteria to two classes ...


CS229 Project Report: NER adaptation

For each transition probability, the MEMM uses a maximum entropy model. A maximum entropy model is used to model the data distribution, which gives us as much information as possible. Without any constraints, the uniform distribution gives us maximum entropy. However, we have training data which gives some facts about the true distribution. What a maximum entropy model does is maximize the entropy of ...


Efficient Maximum Entropy Training for Statistical Object Recognition

In statistical pattern recognition, we use probabilistic models for the task of assigning observations to one of a set of predefined classes, e.g. images of handwritten digits to one of the classes ‘0’ to ‘9’. The principle of maximum entropy is a powerful framework that can be used to estimate class posterior probabilities for pattern recognition tasks. It is a conceptually simple and ...



Publication date: 2001